THE POTENTIAL FOR ADVANCED COMPUTERISED AIDS FOR
COMPREHENSIBLE  WRITING  OF  TECHNICAL  DOCUMENTS 
David  E.  Kieras 
University  of  Michigan 


It  is  generally  agreed  that  most  technical  manuals  for 
military  equipment  are  not  very  comprehensible  and  thus  tend  to  be 
unused (Bond and Towne, 1979). Figure 1 is an excerpt from a
typical  military  equipment  manual.  One  can  see  that  the  sentence 
structure  is  often  convoluted,  even  though  this  manual  is  an 
important  and  mature  document.  Since  many  users  of  technical 
documents  are  relatively  poor  readers,  especially  in  the  military, 
this  lack  of  clarity  is  a  serious  impediment  to  using  the  manuals, 
even  if  the  reader  has  adequate  background  knowledge.  Thus 
improving  the  comprehensibility  of  such  documents  is  largely  a 
matter  of  improving  the  clarity  of  the  writing,  rather  than 
changing  the  content.  This  paper  describes  an  approach  to 
developing  advanced  computer-based  systems  that  will  assist 
writers  in  preparing  more  comprehensible  technical  documents. 

THE  NEED  FOR  COMPUTERISED  WRITING  AIDS 

For  many  years  there  have  been  guidelines  available  that  are 
intended  to  help  technical  writers  write  in  a  more  comprehensible 
fashion.  Despite  the  long  availability  of  guidelines,  the  quality 
of  technical  documentation  is  still  in  need  of  substantial 
improvement;  why  don't  guidelines  help?  Of  course  there  are  many 
problems,  some  essentially  political,  which  make  it  difficult  to 
bring  about  a  fundamental  change  in  how  documentation  for 
equipment is prepared. However, one possible reason why
guidelines have been ineffective lies in the psychological
properties of the writing and editing tasks.

Some  studies  conducted  by  Wright  (in  press)  suggest  that 
correcting  text  according  to  a  set  of  guidelines  is  in  fact  a  very 
difficult  and  complex  skill.  She  found  that  there  was  little 
consistency  between  professional  editors  in  their  evaluations  and 
revisions  of  a  manuscript.  But  since  technical  documents  are 
often  written  by  individuals  with  no  formal  training  in  writing, 
Wright's  studies  involving  ordinary  people  as  subjects  are 
especially  important.  One  group  was  given  a  set  of  six 
guidelines,  with  examples,  and  asked  to  revise  a  short  passage 
with  the  guidelines  in  hand.  The  writing  guidelines  covered 
several  important  and  commonly  accepted  features  of  clear  writing, 
such  as  avoiding  long  modifying  strings,  passive  verbs,  and 
unnecessarily  long  words.  Another  group  performed  the  same  task 
but  without  the  guidelines. 


2-4-3. PRIMARY POWER MODE. The primary
power mode is a cage mode wherein initial application
of power to SINS is accomplished. The primary
power mode is entered when the PRIMARY
POWER (MODE SELECTOR) pushbutton of the
NCCP is pressed. During the primary power mode,
the platform is coarse leveled by the pendulous
leveling resolvers and coarse aligned in azimuth by
the DEPTH and HEADING data converter monitor
drawer. The platform will drive to the indicated
heading when a cage mode is selected. The platform
temperature alarm circuits are activated, causing the
platform temperature alarm lamp to flash until the
binnacle temperature is within its operating range.
The gyro bottoming circuits and alarms are deactivated.
The velocity meter and gyro pump power
supply is turned on. The power relays in the navigation
console connect 115v 400-Hz 3-phase power to
the SINS power supplies and 115v 60-Hz 3-phase
power to the SINS blowers. MARDAN memory
precision power is also applied in the primary
power mode.


Figure  1.  An  excerpt  from  a  military  equipment  manual. 


The  guidelines  did  have  an  effect,  in  that  roughly  twice  as 
many  modifications  to  the  text  were  made  by  the  subjects  with  the 
guidelines.  However,  even  the  subjects  with  the  guidelines  made 
only 39% of the changes to the text that the guidelines addressed.
Thus,  the  majority  of  the  writing  problems  in  the  text  were  left 
unchanged  even  by  those  subjects  who  had  the  guidelines  before 
them  at  all  times. 

This  is  a  startling  result;  it  suggests  that  guidelines  do 
not  help  because  it  is  very  difficult  to  detect  problems  in 
writing.  But  this  follows  from  the  currently  accepted  idea  that 
much  of  the  reading  process  is  very  highly  automated.  In  order  to 
spot  a  comprehensibility  problem,  an  ordinary  reader  would  have  to 
notice  that  the  normally  subconscious  automatic  reading  mechanisms 
were  having  difficulty.  Since  the  person  responsible  for  writing 
a  technical  document  is  likely  to  be  a  very  skilled  reader,  and  be 
familiar  with  the  subject  matter,  he  or  she  will  have  no  problems 
comprehending  the  material,  even  if  it  presents  serious  problems 
for  the  naive  or  poor  reader.  The  result  is  that  writers  will 
fail  to  detect  most  of  the  comprehensibility  problems  in  their 
work.  People  who  are  good  copy  editors  have  probably  developed 
very  specialized  skills,  such  as  being  able  to  monitor  their 
comprehension  processes,  or  read  in  some  sort  of  non-automatic 
mode.


Thus  efforts  to  improve  the  comprehensibility  of  technical 
documents  by  providing  guidelines  will  probably  continue  to  be 
unsuccessful,  although  systematic  training  in  writing  may 
gradually  increase  the  pool  of  good  technical  writers. 

A computerized writing aid that detects comprehensibility
problems would clearly be an advantage simply because it would
perform the detection task much more rapidly and reliably than
most human writers could. Detection is difficult for a writer
because it involves undoing or modifying the very highly
developed and automated skill of reading. However, if a computer
can perform most of this process, the writer would be free to
exercise his or her writing skill in correcting the problems once
they are detected. The correction process still requires skills
that only humans possess, but the detection process is within
reach of current computer technology.

CURRENT  WRITING  AIDS 

Two  major  computerized  writing  aids  already  exist.  One  is 
the  Computerized  Readability  Editing  System  (CRES)  developed  by 
Kincaid and his co-workers (Kincaid, Aagard, & O'Hara, 1980;
Kincaid, Aagard, O'Hara, & Cottrell, 1980; Kincaid, Cottrell,
Aagard, & Risley, 1981). The other is the Writer's Workbench
(WWB) developed by Bell Laboratories (Cherry, 1982; Macdonald,
Frase, Gingrich, & Keenan, 1982).

Both CRES and WWB are intended to be used on a computer as
part  of  a  general  word  processing  and  document  preparation 
package.  After  preparing  a  draft  of  a  document,  the  writer  feeds 
it  into  the  system,  and  obtains  output  about  the  quality  of  the 
writing.  The  CRES  system  provides  an  annotated  copy  of  the 
original  document,  with  specific  problems  pointed  out,  and  some 
global  information,  consisting  of  the  Kincaid-Flesch  readability 
score,  and  a  list  of  the  words  appearing  in  the  text  that  are  not 
on  the  standard  military  vocabulary  list.  The  specific  feedback 
consists  of  several  useful  items.  Sentences  of  excessive  length 
are  flagged,  along  with  the  number  of  words  in  the  sentence.  The 
use  of  the  passive  voice  is  pointed  out,  along  with  strings  of 
words  that  involve  too  many  prepositions,  which  are  often 
associated with awkward phrases. Simpler wording is also suggested;
for example, use is proposed as a replacement for utilize.
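
As a rough illustration of this kind of surface-level feedback, the following Python sketch flags overlong sentences and suggests simpler words. It is not the CRES implementation: the limit MAX_WORDS, the word table SIMPLER, and the function names are hypothetical and chosen only for the example.

```python
import re

# Hypothetical settings for illustration; the actual CRES limits and
# vocabulary list are not given in this paper.
MAX_WORDS = 20
SIMPLER = {"utilize": "use", "terminate": "end", "commence": "begin"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def review(text):
    """Return (sentence_number, message) pairs for each surface problem found."""
    comments = []
    for number, sentence in enumerate(split_sentences(text), start=1):
        words = re.findall(r"[A-Za-z']+", sentence)
        if len(words) > MAX_WORDS:
            comments.append((number, f"long sentence: {len(words)} words"))
        for word in words:
            simpler = SIMPLER.get(word.lower())
            if simpler:
                comments.append(
                    (number, f"simpler wording: use '{simpler}' for '{word}'")
                )
    return comments
```

Checks of this kind need only word lists and counts, which is why they can run on small machines; they do not examine how a sentence relates to the sentences around it.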

The  WWB  system  is  actually  a  family  of  programs  that  are 
based  on  an  ingenious  algorithm  that  can  classify  words  in  a  text 
according  to  their  parts  of  speech  using  very  little  lexical 
information. However, the feedback provided to the writer by WWB,
at least as described by Cherry (1982) and Macdonald et al. (1982), seems to
be no better than that of the CRES system, and in some ways worse. The
basic  theme  of  WWB  seems  to  be  providing  global  statistical 
information  about  a  document,  rather  than  exact  criticism.  For 
example,  one  program  provides  the  scores  for  several  readability 
formulas, along with such statistics as the proportion of words
appearing as various parts of speech, for example the percentage of
words that are adjectives. The guidance for how such information
should  be  used  is  of  dubious  value;  for  example,  Cherry  (1982) 
suggests  that  if  the  ratio  of  adjectives  to  nouns  is  excessive,  it 
is  a  sign  of  poor  writing.  Another  program  compares  the 
statistics  for  a  document  with  those  for  one  that  has  been  chosen 
to  represent  good  documents  of  that  type.  For  example,  the 
program  will  inform  the  writer  of  an  interoffice  memo  that  his  or 
her  memo  has  more  uses  of  the  passive  voice  than  a  good 
interoffice  memo.  Another  program  flags  problems  in  a  manner 
similar  to  the  CRES  system,  but  does  not  appear  to  be  as 
comprehensive.

PROBLEMS  WITH  CURRENT  SYSTEMS 

The  fundamental  problem  with  both  CRES  and  WWB  is  that  they 
are  not  based  on  the  actual  psychology  of  comprehension,  but 
rather on ordinary writers' intuitions, many of which are
incorrect or inapplicable in terms of what is known about
comprehension. These intuitions, which inspire the guidelines,
seem to be based both on ideas about what constitutes clear writing
and on artistic customs about what constitutes good literary style.
For  example,  many  textbooks  on  writing  will  recommend  that  one  use 
variety  in  sentence  length,  and  variety  in  forms  of  reference  as 
well.  Thus,  in  the  paper  on  WWB  by  Cherry  (1982),  explicit 
recommendations  are  made  to  use  the  statistics  provided  by  WWB  to 
increase  the  variety  in  one's  writing.  However,  according  to  the 
psychological  work  on  comprehension,  variety  in  reference  may 
easily produce problems for the reader, as will be described further
below, and variety of sentence length in itself has no particular
value.

Many  of  these  literary  customs  are  apparently  intended  to 
maintain  the  reader's  interest.  However,  it  is  reasonable  to 
assume  that  the  reader  of  a  technical  document  does  not  have  an 
interest  problem.  Such  a  reader  is  neither  a  classroom  student, 
struggling  to  keep  awake  while  reading  boring  material,  nor  a 
casual  reader  hoping  to  be  entertained.  Rather,  the  reader  of  a 
technical  document  needs  to  get  the  information  out  of  the 
document  as  quickly  and  efficiently  as  possible,  so  that  he  or  she 
can  complete  the  task  at  hand.  Certainly,  the  reason  why 
technical  documents  are  underused  is  not  that  they  are  boring,  but 
that they are inefficient information sources.

A secondary problem is that both systems use fairly simple
algorithms; neither of them processes the input to any depth.
Doing any extensive processing, along the lines of an artificial
intelligence natural language system, is out of the question,
since these systems run on small machines, such as PDP-11s. But
by using the newer and much more powerful machines now appearing
on the market as professional work stations, it should be possible
to devise much more sophisticated writing aids.

A  NEW  APPROACH 

A  new  approach  to  computerized  writing  aids  is  based  on  using 
the  results  and  theory  from  the  research  literature  on 
comprehension  to  specify  what  problems  the  system  should  detect. 
By using techniques from artificial intelligence for the
processing of natural language, greater sophistication can be
achieved.

Research  in  modern  psycholinguistics  has  about  a  twenty-year 
history.  Of  the  many  topics  that  have  been  studied,  only  a 
portion  of  the  work  is  relevant  to  improving  the  comprehensibility 
of  text,  but  this  still  leaves  roughly  200  relevant  studies  in  the 
literature.  Most  of  these  deal  with  individual  isolated 
sentences,  but  some  of  the  newer  work  deals  with  passage  structure 
or  groups  of  sentences  and  their  relations. 

Comprehension  research  results 

Examples of comprehensibility results. Table 1 presents some
examples of comprehensibility rules that can be proposed based on
some  of  the  results  in  the  literature.  The  first  rule  is  of 
course  a  familiar  result,  and  is  used  in  both  CRES  and  WWB.  The 
second  rule  is  a  newer  idea,  and  can  be  motivated  both  empirically 
and  in  terms  of  theoretical  considerations  of  the  information 
processing  that  is  required  to  determine  the  referent  of  a 
pronoun.

Table 1

Examples of Comprehensibility Rules
from the Psycholinguistics Literature

1. Active is better than passive.
(Tannenbaum & Williams, 1968)

2. A pronoun should refer to the subject of the previous
sentence.
(Frederiksen, 1979)

3. Relative clauses should begin with a relative pronoun.
(Hakes & Foss, 1970)

4. If the topic of the passage is the logical object,
then passive is better than active.
(Perfetti & Goldman, 1974, 1975)

5. Temporarily changing the subject impedes processing.
(Lesgold, Roth, Curtis, & Riley, 1979)

6. Refer to an object in the same way as it was previously
referred to; even a synonym slows processing.
(Yekovich & Walker, 1978)

7. Refer to an object that was either explicitly mentioned
previously, or is strongly implied by the previous text.
(Haviland & Clark, 1974)

8. Indefinite determiners should be used only on textually
new items.
(de Villiers, 1974)

9. Connective words (e.g., however) improve comprehension.
(Haberlandt & Kennard, 1981)


The third and fourth rules are good examples of how the
research can correct standard writer's wisdom. CRES recommends
that relative pronouns be deleted, apparently because they
increase the length of the sentence. However, the empirical work
shows that at least under some conditions, sentences with relative
pronouns are easier to understand than those without; the pronoun
marks the beginning of a relative clause, and so can relieve some
local parsing ambiguity. The next rule corrects the blanket
condemnation of the passive voice appearing in many writing
guides. The research shows that the passive voice has the
important function of allowing the surface subject of a sentence
to be the same as the topic of discourse.

The  fifth  rule  is  an  example  of  recent  work  done  on  the  role 
of  topic  information  during  comprehension.  Changing  the  topic  of 
discourse  is  legitimate,  but  if  it  is  not  justified,  the  reader 
will  be  misled  and  slowed  down. 

The  sixth  rule  directly  contradicts  the  recommendation  that 
the  writer  use  variety  in  how  things  are  referred  to.  Such 
variety  in  reference  actually  costs  extra  processing.  This  has 
not  been  studied  very  directly,  but  there  are  some  clear 
conclusions,  such  as  the  fact  that  even  the  use  of  a  synonym  can 
slow  down  processing.  In  technical  documentation,  this  issue 
could  be  very  important.  Often  there  are  objects  that  are  very 
similar  to  each  other,  but  are  distinguished  only  by  the  modifiers 
appearing  in  the  noun  phrase.  For  example,  a  device  might  have 
many  relays,  which  are  distinguished  by  phrases  such  as  antenna 
excursion  limiting  cutout  relay,  and  magnetron  anode  current  limit 
relay.  Perhaps  in  such  a  context  there  should  be  no  variety  in 
reference  at  all. 

A  related  issue  is  addressed  in  the  seventh  rule.  In  the 
normal  course  of  comprehension,  the  writer  and  reader  have  a  tacit 
contract  that  the  writer  will  not  refer  to  objects  in  ways  that 
the  reader  cannot  match  up  with  previously  mentioned  or  known 
objects.  This  means  that  the  writer  should  ensure  that  the  reader 
can  easily  determine  what  object  is  being  referred  to.  If  a 
reference  is  made  to  a  previously  mentioned  object  in  an  obscure 
fashion,  then  the  reader  must  take  extra  time  and  effort  to 
resolve  the  reference. 

An  interesting  result  by  de  Villiers  leads  to  the  eighth  rule 
in  the  Table.  This  rule  would  be  rarely  violated  even  in  poor 
writing,  but  it  does  serve  as  an  example  of  the  kind  of 
consideration  that  emerges  clearly  from  even  simple 
psycholinguistics  work,  but  which  is  not  treated  at  all  in 
standard  writing  textbooks.  de  Villiers  discovered  that  simply 
changing  all  of  the  appearances  of  the  definite  determiner  the  in 
a  simple  passage  to  the  indefinite  determiner  a  will  cause  the 
reader  to  switch  from  interpreting  the  passage  as  a  connected 
story  to  viewing  it  as  a  set  of  isolated  sentences.  Apparently, 
for most readers, the indefinite determiner acts as an extremely
strong  signal  for  a  textually  new  item.  In  contrast,  the  definite 
determiner  is  more  ambiguous  in  its  function  (see  Kieras,  in 
press).


The last example rule listed in Table 1 concerns connective
words like however and therefore. These words are penalized by
readability formulas because they can increase the length of
sentences. Connective words like although compound this problem
because  they  require  a  much  more  complicated  sentence  structure. 
However,  these  words  should  reduce  the  amount  of  processing 
effort,  because  they  explicitly  specify  the  logical  relationship 
between  the  previous  ideas  in  the  passage  and  the  idea  that 
follows.  If  the  connective  word  is  missing,  then  the  reader  is 
put  in  the  position  of  having  to  infer  the  relation,  thereby 
taking  extra  time  and  effort. 

This sample of results is by no means exhaustive. A large scale
review of the comprehensibility results in the psycholinguistics
literature can be found in Kieras & Dechert (in preparation).
These examples nevertheless illustrate how actual empirical and
theoretical results argue against many aspects of conventional
writer's wisdom, while at the same time giving extremely specific
suggestions on how to structure text so that it is more
comprehensible.

Limitations of the research literature. The research literature
is limited in several ways. First of all, the
psycholinguistics  studies  rarely  combine  two  or  more  structural 
features,  so  there  is  little  information  on  what  writing  problems 
are  the  most  serious  ones,  or  how  they  interact  with  each  other. 
Furthermore,  much  of  the  work  has  been  done  in  the  context  of 
isolated  sentences.  While  this  is  convenient  experimentally,  it 
is  quite  rare  that  the  reader  of  technical  documentation  must 
process  single  isolated  sentences.  Finally,  there  are  many  issues 
of  great  importance  in  technical  documentation  that  have  not  been 
considered  in  the  psycholinguistics  literature.  A  good  example  is 
the  effect  of  inconsistent  terminology.  In  contrast,  there  are  a 
great  number  of  studies  comparing  self-embedded  constructions  to 
right-branching,  even  though  self-embedded  constructions  of  any 
depth  are  rare. 

The  conclusion  is  that  given  the  spotty  empirical  coverage, 
it  is  important  to  apply  theoretical  ideas  about  comprehension,  as 
well  as  empirical  results,  to  the  design  of  an  advanced  writing 
aid.

Comprehension  theory 

The  theory  of  comprehension  has  been  very  well  developed  in 
the  last  ten  years.  Information  processing  models  for 
comprehension  have  been  elaborated  to  the  point  of  being  expressed 
in  the  form  of  computer  simulation  models  that  rigorously  specify 
many  of  the  processes  and  structures  involved  in  comprehension. 
These  models  were  based  closely  on  the  work  on  natural  language 
processing  being  done  in  the  field  of  artificial  intelligence. 
Some  representative  simulation  theories  of  comprehension  appear  in 
Kintsch  and  van  Dijk  (1978),  Thibadeau,  Just,  and  Carpenter 
(1982), and Kieras (1981, 1983). These theories are elaborate
enough, and have been compared in enough detail to data, that they
can be used as comprehensive descriptions of the processes that
the  reader  must  perform  in  order  to  comprehend  text.  Thus  they 
provide  a  starting  point  for  defining  the  processes  that  an 
advanced  writing  aid  should  perform. 

A survey of the theoretical literature is not possible in
this paper. However, a good summary of the general theoretical
framework can be provided (cf. Kintsch, 1977) and related to the
types of possible comprehensibility problems (see also Kintsch &
Vipond, 1979; Miller & Kintsch, 1980). The process of
comprehension involves several stages, which are performed
sequentially to a great extent, but also interact heavily. Each
of these stages can be related to sources of comprehension
difficulty. The first stage is word identification, which matches
the visual pattern of a printed word to an entry in the reader's
internal lexicon. Some of the impediments to comprehension at
this stage are very well understood, such as the presence of
unknown or low-frequency words. The next stage performs syntactic
analysis of the sentence, deriving the relationships of the words
to each other. Impediments to comprehension here will consist
primarily of complicated sentence structure, such as self-embedded
sentences. Then comes semantic analysis, in which the word
meanings are associated with each other as specified by the
sentence syntax and the previous context, to produce a
representation of the meaning of the sentence in terms of how it
relates to the previous material. Problems of ambiguity,
coherence, reference, and global organization can appear at this
stage. The final stages are concerned with pragmatic and
functional analysis, in which the concern is what the point of the
material is, and how it is related to the reader's situation or
task. Comprehensibility problems can appear at this level if the
large-scale organization of the material is poor, or it fails to
inform the reader of what content is relevant to the task at hand.

What  is  feasible? 

Our present knowledge and technology are adequate to allow an
advanced  writing  aid  system  to  identify  some  of  the 
comprehensibility  problems  that  can  occur  at  each  stage  of  the 
comprehension  process.  CRES  handles  the  problems  in  the  word 
identification  stage  by  identifying  unusual  and  unknown  words. 
CRES  also  detects  some  of  the  problems  in  the  syntax  stage  by 
finding  some  forms  of  bad  sentence  structure.  Going  beyond 
systems  like  CRES  and  WWB,  into  a  full  analysis  of 
comprehensibility  problems  in  the  semantic  and  later  stages,  would 
require  heavy  use  of  general  knowledge  and  also  the  relevant 
domain-specific  knowledge  such  as  electronics  theory.  This  is 
well  beyond  the  current  state  of  the  art  in  artificial 
intelligence,  and  can  be  ruled  out  of  this  discussion  of  advanced 
writing  aids.  However,  there  is  a  certain  set  of  issues  in  the 
semantic,  pragmatic,  and  functional  stages  that  are  within  the 
reach  of  current  artificial  intelligence  techniques,  and  are  also 
very  important  to  comprehensibility.  For  example,  determining 
whether  the  material  is  coherent  in  certain  ways  is  quite  simple. 


The  key  idea  in  the  development  of  advanced  writing  aids  is 
the  principle  that  the  system  does  not  have  to  be  able  to  handle 
input  as  complex  and  obscure  as  human  readers  can.  The  goal  of  an 
advanced  writing  aid  is  only  to  identify  when  it  is  difficult  to 
process  the  text;  the  system  does  not  have  to  be  able  to  overcome 
all  of  the  difficulties,  nor  fully  understand  the  input,  any  more 
than  a  poor  reader  would.  This  principle,  which  could  be 
flippantly  termed  artificial  stupidity,  is  very  important,  because 
this  is  why  advanced  writing  aids  are  now  possible,  even  though 
many  fundamental  problems  in  both  the  artificial  intelligence  and 
cognitive  psychology  of  language  processing  are  not  yet  solved. 

An  advanced  writing  aid  would  resemble  a  model  of 
comprehension, or an AI-based natural language processor, but
there  are  some  important  differences.  The  similarities  are  that 
the  system  consists  of  a  parsing  process,  a  set  of  rules  for 
integrating  sentences  together,  and  the  use  of  a  working  memory  to 
keep  track  of  the  current  topics  and  the  structures  being  built. 
However,  the  difference  is  that  the  level  of  comprehension 
required  can  be  quite  shallow,  as  argued  above,  and  thus  little  or 
no  general  knowledge  is  required.  For  example,  the  system  can 
identify  and  parse  noun  phrases  and  maintain  a  record  of  which 
referents  have  been  mentioned,  so  that  it  can  determine  whether  a 
new  noun  phrase  can  be  easily  matched  against  a  previous 
reference.  Likewise,  the  system  could  keep  track  of  the  current 
topic  by  detecting  some  of  the  simple  patterns  in  which  topics  are 
changed  in  the  course  of  a  passage. 
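
The referent record described above can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not the demonstration system described later: noun phrases are assumed to arrive already identified, two phrases are taken to name the same referent whenever they share a final (head) noun, and the class and method names are hypothetical.

```python
class ReferentRecord:
    """Tracks noun phrases already mentioned, keyed by head noun (last word)."""

    ARTICLES = {"THE", "A", "AN"}

    def __init__(self):
        # head noun -> wording used when the referent was first mentioned
        self.first_form = {}

    def check(self, noun_phrase):
        """Record a non-empty noun phrase and return a comment about it."""
        words = [w for w in noun_phrase.upper().split() if w not in self.ARTICLES]
        form = " ".join(words)
        head = words[-1]
        if head not in self.first_form:
            self.first_form[head] = form
            return f"new referent: {form}"
        if self.first_form[head] != form:
            return f"inconsistent terminology - use: {self.first_form[head]}"
        return f"old referent: {form}"
```

For example, checking "the phaser system" and then "the system" yields a new-referent comment followed by an inconsistent-terminology comment. Matching on the head noun alone would wrongly merge distinct objects that share a head noun, such as the two relays mentioned earlier, so a usable system would need a more careful matching rule.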

Thus,  such  a  system  should  be  feasible  simply  because  it  uses 
only  a  subset  of  what  is  currently  known  about  natural  language 
processing,  both  in  artificial  intelligence  and  in  cognitive 
psychology.  Rather  than  attempting  to  be  a  complete  comprehension 
system,  this  system  will  only  capture  certain  aspects  of 
comprehension  and  then  signal  when  some  relatively  simple  rules 
have  been  violated. 

A  demonstration  system 

A  demonstration  system  along  these  lines  will  be  briefly 
described.  The  demonstration  system  uses  an  ATN  parser  (see 
Woods,  1970)  borrowed  directly  from  the  comprehension  simulation 
model described in Kieras (1983). This parser is severely
limited,  but  is  able  to  handle  many  complex  constructions.  The 
ATN  parser  works  in  conjunction  with  a  production  system 
interpreter  borrowed  directly  from  another  comprehension 
simulation  model,  this  one  described  in  Kieras  (1982).  The  system 
maintains  a  semantic  network  data  base,  also  borrowed  from  earlier 
models.  It  should  be  kept  in  mind  that  this  is  not  intended  to  be 
a  usable  system,  or  even  a  prototype  of  one.  Rather,  it 
demonstrates  how  a  simple  reorganization  of  components  from 
existing  simulation  models  of  comprehension  can  be  applied  to  the 
writing  aid  problem.  A  truly  usable  system  will  require 
programming  that  is  more  efficient,  even  if  less  psychologically 
relevant. 


The  demonstration  system  is  set  up  on  a  Xerox  1108  LISP 
machine,  using  several  display  windows.  One  window  contains  a 
list  of  the  input  sentences,  followed  by  the  comments  generated 
for  each  one.  Tables  2  and  3  are  copies  of  these  windows. 
Another  window  contains  a  list  of  the  referents  currently  defined 
by  the  passage;  the  contents  of  this  window  appear  at  the  end  of 
Tables 2 and 3. A detail important to understanding the examples
is  that  the  names  of  the  referent  nodes  are  arbitrary  symbols  like 
H0409.  Other  windows  display  the  state  of  the  processing  for 
development  purposes. 

Table  2  shows  a  series  of  poorly  written  sentences,  while  the 
passage  in  Table  3  contains  much  of  the  same  content,  but  better 
expressed.  The  materials  are  based  on  the  simulated  technical 
manuals  described  in  Kieras  (in  preparation).  The  system 
implements  simplified  forms  of  the  comprehensibility  rules  shown 
in  Table  4. 

The  operation  of  the  demonstration  system  can  be  briefly 
summarized. The basic principle of operation is the processing
of  given  and  new  information  (see  Clark  &  Haviland,  1977),  similar 
to  the  model  described  in  Kieras  (1981).  The  principle  is  that 
each  sentence  in  a  text  will  contain  new  information  about 
referents  that  are  given  in  the  context  of  the  preceding 
sentences.  The  representations  for  the  given  referents  are 
located,  and  the  new  information  added  to  them. 
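
The given-new principle can be illustrated with a minimal Python sketch. It assumes that each sentence has already been reduced to a list of referent names, which is the hard part and is not shown; the function name and the comment wording are hypothetical and do not reproduce the output of the demonstration system.

```python
def given_new_comments(sentences):
    """Classify the referents of each sentence as given or new.

    sentences: list of lists of referent names, in text order.
    Returns one comment string per sentence.
    """
    known = set()
    comments = []
    for index, referents in enumerate(sentences):
        given = [r for r in referents if r in known]
        new = [r for r in referents if r not in known]
        if index > 0 and not given:
            # Nothing in this sentence ties it to the preceding text.
            comments.append(
                "no given referent: sentence is not tied to the preceding text"
            )
        else:
            comments.append(f"given={given} new={new}")
        known.update(referents)
    return comments
```

A sentence after the first that mentions only new referents gives the reader nothing to attach the new information to, which is the situation the sketch reports.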

The  system  processes  the  input  one  sentence  at  a  time, 
maintaining  a  representation  of  the  content  of  the  previously 
processed  sentences.  The  representation  at  this  time  is  based  on 
Anderson's  (1976)  ACT  representation,  in  which  a  referent  is 
defined  by  a  piece  of  semantic  network  attached  to  a  node  that 
represents  the  referent.  In  order  to  allow  highly  efficient 
referent  searches  based  on  surface  structure,  a  word  string  that 
consists  of  a  short  adjective-noun  phrase  is  stored  as  the  simple 
referential  form  of  the  referent.  The  system  constructs  the  ACT 
network  representation  for  the  sentence  content  and  annotates  it 
with  tags  about  the  syntactic  role  played  in  the  surface  sentence 
by  portions  of  the  network  structure.  For  example,  the 
proposition  node  corresponding  to  the  head  noun  of  a  noun  phrase, 
or  to  the  main  sentence  clause,  is  tagged  as  such.  The  referent 
nodes  are  tagged  to  show  whether  they  appeared  as  surface  subject 
or  surface  object.  This  technique  allows  production  rules  to 
easily  consider  both  the  syntactic  and  semantic  features  of  the 
input.
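The representation described above can be pictured with a small data-structure sketch. This is only an illustration in Python, not the original implementation: the class names, the tag strings ("HEAD-NOUN", "MAIN-CLAUSE", "SURFACE-SUBJECT"), and the index dictionary are assumptions introduced here, while the referent numbers are borrowed from Table 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Proposition:
    """One proposition in the piece of network attached to a referent node."""
    predicate: str
    args: Tuple[str, ...] = ()
    # Syntactic annotations on the proposition (tag names are assumed).
    tags: Set[str] = field(default_factory=set)


@dataclass
class Referent:
    """A referent node plus its optional simple referential form."""
    node_id: str
    simple_form: Optional[Tuple[str, ...]]  # short adjective-noun word string
    propositions: List[Proposition] = field(default_factory=list)
    # Surface roles the referent played in the current sentence (assumed names).
    surface_roles: Set[str] = field(default_factory=set)


booster = Referent(
    node_id="R0134",
    simple_form=("ENERGY", "BOOSTER"),
    propositions=[
        Proposition("ENERGY"),
        Proposition("BOOSTER", tags={"HEAD-NOUN"}),
        Proposition("RECEIVE", ("R0144",), tags={"MAIN-CLAUSE"}),
    ],
    surface_roles={"SURFACE-SUBJECT"},
)

# Index from word string to node, so a surface lookup needs no network search.
simple_form_index: Dict[Tuple[str, ...], str] = {
    booster.simple_form: booster.node_id
}
```

Keeping the word string beside the node is what makes the surface-based search cheap: a lookup by word string never has to touch the propositions.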

While  processing  the  input  sentence,  the  parser  identifies 
each  noun  phrase,  and  calls  the  production  system  interpreter  to 
resolve  the  reference.  The  production  rules  attempt  to  match  the 
noun  phrase  to  a  referent  already  mentioned,  in  order  to  determine 
whether  the  noun  phrase  refers  to  a  textually  old  referent,  or  to
a  textually  new  referent,  and  tag  the  structure  as  such.  The
search  for  a  previous  referent  is  a  two-stage  process.  First,  the 
simple  referential  form,  if  any,  for  the  surface  noun  phrase  is 
matched  against  the  simple  referential  forms  for  all  defined 


Table  2 


Sample  Commentary  on  Poor  Text 


HEADING: THE PHASER SYSTEM
NEW REFERENT defined: R0099
New Discourse Topic: R0099

THE SYSTEM CONTAINS AN ENERGY BOOSTER THAT IS POWERED BY THE MAIN SHIPBOARD POWER SUPPLY
Old referent found: R0099
INCONSISTENT terminology - use: PHASER SYSTEM
NEW REFERENT defined: R0102
Using last two words for simple reference: MAIN SHIPBOARD POWER SUPPLY
EXCESSIVE content in noun phrase
NEW REFERENT defined: R0107
COMPLEXITY imposed - no simple form for new referent

THE BOOSTER RECEIVES HIGH VOLTAGE FROM THE MAIN SUPPLY
Old referent found: R0107
COMPLEXITY unnecessary - define simple form: BOOSTER
NEW REFERENT defined: R0116
Old referent found: R0102
INCONSISTENT terminology - use: POWER SUPPLY
Sentence subject changes the topic R0099 R0107
Topic change is ok since this is a topic chain R0099 R0107

AN ENERGY ACCUMULATOR IS ALSO USED BY THE SYSTEM
NEW REFERENT defined: R0123
Old referent found: R0099
INCONSISTENT terminology - use: PHASER SYSTEM
INCOHERENT: New referent in sentence subject changes the topic R0123
BAD PASSIVE - subject is not the topic R0107

THE ACCUMULATOR IS ENERGIZED BY THE BOOSTER
Old referent found: R0123
INCONSISTENT terminology - use: ENERGY ACCUMULATOR
Old referent found: R0107
COMPLEXITY unnecessary - define simple form: BOOSTER
Passive sentence is ok because it preserves the topic R0123
*** REFERENT LIST ***

(R0099 (PHASER SYSTEM) ((PHASER) (SYSTEM) (CONTAIN) (CONTAIN R0107) (USE R0123)))
(R0102 (POWER SUPPLY) ((MAIN) (SHIPBOARD) (POWER) (SUPPLY) (POWER R0107)))
(R0107 NIL ((ENERGY) (BOOSTER) (RECEIVE R0116) (ENERGIZE R0123)))
(R0116 (HIGH VOLTAGE) ((HIGH) (VOLTAGE)))
(R0123 (ENERGY ACCUMULATOR) ((ENERGY) (ACCUMULATOR)))


Table  3 

Sample  Commentary  on  Good  Text 


HEADING: THE PHASER SYSTEM
NEW REFERENT defined: R0131
New Discourse Topic: R0131

THE PHASER SYSTEM CONTAINS AN ENERGY BOOSTER AND AN ENERGY ACCUMULATOR
Simple reference to: R0131
Old referent found: R0131
NEW REFERENT defined: R0134
NEW REFERENT defined: R0137

THE ENERGY BOOSTER RECEIVES HIGH VOLTAGE FROM THE POWER SUPPLY
Simple reference to: R0134
Old referent found: R0134
NEW REFERENT defined: R0144
NEW REFERENT defined: R0147
Sentence subject changes the topic R0131 R0134

THE ENERGY BOOSTER ENERGIZES THE ENERGY ACCUMULATOR
Simple reference to: R0134
Old referent found: R0134
Simple reference to: R0137
Old referent found: R0137

THE ENERGY BOOSTER SHOULD BE MONITORED CAREFULLY BY THE OPERATOR OF THE PHASER SYSTEM
Simple reference to: R0134
Old referent found: R0134
Simple reference to: R0131
Old referent found: R0131
NEW REFERENT defined: R0155
COMPLEXITY imposed - no simple form for new referent
Passive sentence is ok because it preserves the topic R0134

*** REFERENT LIST ***

(R0131 (PHASER SYSTEM) ((PHASER) (SYSTEM) (CONTAIN R0134) (CONTAIN R0137) (POSSESS R0155)))
(R0134 (ENERGY BOOSTER) ((ENERGY) (BOOSTER) (RECEIVE R0144) (ENERGIZE R0137)))
(R0137 (ENERGY ACCUMULATOR) ((ENERGY) (ACCUMULATOR)))
(R0144 (HIGH VOLTAGE) ((HIGH) (VOLTAGE)))
(R0147 (POWER SUPPLY) ((POWER) (SUPPLY)))
(R0155 NIL ((OPERATOR) (MONITOR R0134)))


Table  4 

Comprehensibility  Rules  in  the  Demonstration  System 


Reference

1.  Referents  should  be  referred  to  by  an  unambiguous  and  short
(2-3  words)  simple  referential  form  that  is  used  consistently
throughout  the  document.

2.  The  identity  of  a  referent  must  be  trivially  determinable
from  the  referencing  noun  phrase;  no  inference  should  be
required.

3.  The  pronoun  "it"  should  refer  only  to  the  subject  referent  of
the  preceding  sentence.

4.  Propositional  pronouns  should  refer  only  to  the  main
proposition  of  the  preceding  sentence.

Sentence  Structure

1.  Relative  clauses  must  have  a  relative  pronoun  (which,  that)
unless  the  main  proposition  of  the  clause  is  based  on  a
preposition.

2.  A  noun  phrase  should  contain  no  more  than  about  5
propositions.

Textual  Coherence

1.  Textually  new  referents  and  propositions  should  appear  only
in  clause  predicates.

2.  Material  should  be  grouped  so  that  the  following  coherence
rules  can  be  followed:  The  subject  noun  phrase  of  each
sentence  should  refer  either  to  the  subject  referent  of  the
previous  sentence,  or  to  a  textually  new  referent  introduced  in
the  predicate  of  the  previous  sentence  (chained  construction),
or  the  discourse  topic,  defined  as  the  subject  of  a  heading  or
the  first  sentence  of  the  passage.

3.  Although  passive  constructions  should  be  avoided,  a  passive
construction  that  is  required  to  maintain  coherence  should  be
used  rather  than  the  active  construction.

Textual  Organization

1.  New  referents  should  be  introduced  in  simple  referential
form,  and  additional  information  added  in  later  sentences.


referents.  If  a  match  is  found  at  this  point,  the  search  is  over 
quickly.  Otherwise,  the  system  must  analyze  the  semantic 
representation,  a  much  slower  process.  This  strategy  corresponds 
to  the  hypothesis  that  if  the  surface  form  of  a  noun  phrase  is 
identical  to  the  earlier  surface  form  of  the  intended  referent, 
processing  will  be  much  faster. 

If  the  surface  match  fails,  the  second  stage  in  the  search  is 
done  by  a  set  of  production  rules  which  matches  the  semantic 
content  of  the  noun  phrase,  one  proposition  at  a  time,  against  the 
network  representation  for  the  previous  passage  content,  and 
attempts  to  find  the  node  whose  propositions  all  match.  If  more 
than  one  such  node  is  found,  then  a  comment  is  made  that  the 
reference  is  ambiguous.  If  the  noun  phrase  is  successfully 
matched,  then  the  structure  representing  the  noun  phrase  is 
discarded,  and  replaced  with  the  node  for  the  referent. 
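The two-stage search can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the production-rule implementation itself: propositions are flattened to tuples, the memory layout and the function name `resolve` are hypothetical, and the wording of the ambiguity comment is invented (the other two comment strings follow Tables 2 and 3). Returning no node for an ambiguous match is also an assumption, since the text says only that a comment is made.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Prop = Tuple[str, ...]
# node id -> (simple referential form or None, propositions about the node)
Memory = Dict[str, Tuple[Optional[Tuple[str, ...]], FrozenSet[Prop]]]


def resolve(words: List[str], props: FrozenSet[Prop],
            memory: Memory) -> Tuple[Optional[str], List[str]]:
    """Return (matched node id or None, commentary lines) for one noun phrase."""
    comments: List[str] = []

    # Stage 1: fast match of the surface word string against stored simple forms.
    key = tuple(words)
    for node_id, (simple_form, _) in memory.items():
        if simple_form is not None and simple_form == key:
            comments.append(f"Simple reference to: {node_id}")
            comments.append(f"Old referent found: {node_id}")
            return node_id, comments

    # Stage 2: slower semantic match; every proposition of the noun phrase
    # must be found among the propositions stored for the candidate node.
    candidates = [node_id for node_id, (_, node_props) in memory.items()
                  if props <= node_props]
    if len(candidates) == 1:
        comments.append(f"Old referent found: {candidates[0]}")
        return candidates[0], comments
    if len(candidates) > 1:
        # Comment wording is assumed; the source only says a comment is made.
        comments.append("AMBIGUOUS reference: " + " ".join(candidates))
    return None, comments


memory: Memory = {
    "R0131": (("PHASER", "SYSTEM"), frozenset({("PHASER",), ("SYSTEM",)})),
    "R0134": (("ENERGY", "BOOSTER"), frozenset({("ENERGY",), ("BOOSTER",)})),
    "R0137": (("ENERGY", "ACCUMULATOR"),
              frozenset({("ENERGY",), ("ACCUMULATOR",)})),
}
```

With this memory, the phrase ENERGY BOOSTER is resolved in the first stage, while the shortened phrase BOOSTER falls through to the second stage and is still matched to R0134, which is the situation the "COMPLEXITY unnecessary" comments in Table 2 flag.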

If  no  match  is  found,  the  system  comments  that  a  new  referent 
is  defined,  and  builds  the  corresponding  network  structure.  An 
indefinite  noun  phrase  is  always  treated  as  defining  a  new 
referent.  If  the  noun  phrase  for  a  new  referent  consists  of  a 
short  string  consisting  only  of  one  or  two  adjectives  followed  by 
a  noun,  it  is  used  as  the  simple  referential  form  for  the 
referent.  Thus,  an  emitter  bias  resistor  can  later  be  efficiently 
referred  to  as  the  emitter  bias  resistor.
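The rule for new referents can be illustrated with a short Python sketch. It assumes that the parser supplies each noun phrase, minus its determiner, as a list of (word, tag) pairs, and that noun modifiers such as EMITTER are tagged as adjectives; the tag names and function names are hypothetical. The fallback seen in Table 2, where only the last two words of a long phrase are used for simple reference, is not modelled here.

```python
from typing import List, Optional, Tuple

INDEFINITE_ARTICLES = {"A", "AN"}


def always_new(determiner: str) -> bool:
    """An indefinite noun phrase is always treated as defining a new referent."""
    return determiner.upper() in INDEFINITE_ARTICLES


def simple_referential_form(
        tagged: List[Tuple[str, str]]) -> Optional[Tuple[str, ...]]:
    """Return the word string if it is one or two adjectives plus a noun."""
    if not tagged:
        return None
    modifiers, head = tagged[:-1], tagged[-1]
    if head[1] != "NOUN":
        return None
    if not 1 <= len(modifiers) <= 2:
        return None
    if any(tag != "ADJ" for _, tag in modifiers):
        return None
    return tuple(word for word, _ in tagged)


resistor = [("EMITTER", "ADJ"), ("BIAS", "ADJ"), ("RESISTOR", "NOUN")]
form = simple_referential_form(resistor)
```

A phrase that fails the test gets no simple form, which corresponds to the NIL entries in the referent lists of Tables 2 and 3.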

The  final  result  of  parsing  and  reference  resolution  is  a 
piece  of  semantic  network  structure  that  represents  the  textually 
new  content  of  the  sentence.  A  set  of  production  rules  is  then
applied;  these  rules  comment  on  how  the  surface  form  associated
with  the  new  structure  relates  to  the  previous  passage  content.
Thus,  if  the  surface  subject  is  a  new  referent,  the  system  will
comment  on  the  loss  of  coherence.  If  the  sentence  is  passive,
but  the  surface  subject  is  the  current  topic  of  discourse,  then  a
comment  is  made  that  the  passive  is  present,  but  acceptable.
After  completing  the  commentary,  the  content  of  the  sentence  is 
added  to  the  representation  of  the  passage  content,  and  processing 
on  the  next  sentence  is  begun. 
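The commentary step for coherence and passives might look like the following Python sketch. It is an assumption-laden illustration rather than the actual rule set: the function name and arguments are hypothetical, the comment strings are copied from Tables 2 and 3, and the bookkeeping that updates the current topic between sentences is left out.

```python
from typing import List


def comment_on_sentence(subject_id: str, subject_is_new: bool,
                        is_passive: bool, topic_id: str) -> List[str]:
    """Produce coherence and passive commentary for one processed sentence."""
    comments: List[str] = []
    if subject_is_new:
        # A textually new referent in subject position breaks coherence.
        comments.append(
            "INCOHERENT: New referent in sentence subject changes the topic "
            + subject_id)
    if is_passive:
        if subject_id == topic_id:
            # The passive keeps the current topic in subject position.
            comments.append(
                "Passive sentence is ok because it preserves the topic "
                + topic_id)
        else:
            comments.append(
                "BAD PASSIVE - subject is not the topic " + topic_id)
    return comments


# "AN ENERGY ACCUMULATOR IS ALSO USED BY THE SYSTEM" from Table 2:
# the subject R0123 is new, and the current topic is R0107.
poor = comment_on_sentence("R0123", True, True, "R0107")

# "THE ACCUMULATOR IS ENERGIZED BY THE BOOSTER" from Table 2:
# the subject R0123 is old and is by then the current topic.
acceptable = comment_on_sentence("R0123", False, True, "R0123")
```

Under these assumptions the two calls reproduce the coherence and passive comments shown for the corresponding sentences in Table 2.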

Although  this  system,  as  emphasized  above,  is  not  actually 
usable,  it  does  illustrate  the  soundness  and  feasibility  of  the
basic  concept,  and  how  existing  techniques  from  artificial 
intelligence  and  cognitive  simulation  can  be  easily  applied.  A 
system  tailored  to  the  task,  using  more  sophisticated  algorithms, 
such  as  a  Marcus  parser  (1980),  should  produce  fully  usable 
performance  in  the  near  future. 


CONCLUSION 


The  important  practical  question  is  whether  such  a  system
would  actually  be  useful  if  it  were  built.  One  reason  for  being 
pessimistic  is  the  historical  intransigence  of  the  general  problem 
of  improving  the  quality  of  writing  in  the  real  world.  But  a  more 
serious  problem  is  the  ambiguous  results  of  attempts  to 
demonstrate  scientifically  how  writing  quality  can  be  improved. 
Probably  the  best  example  is  the  work  done  by  Duffy  and  his 
associates  (Duffy  &  Kabance,  1982;  Duffy,  Curran,  &  Sass,  1983), 
who  failed  to  obtain  performance  improvements  even  with  drastic 
changes  to  technical  text.  In  contrast,  experiments  reported  in 
Kieras  (in  preparation)  demonstrate  strong  performance  gains 
resulting  from  improving  the  quality  of  a  simulated  technical 
manual.  The  experimental  task  was  a  realistic  model  of  the 
situations  in  which  technical  manuals  for  equipment  are  used.  The 
manual  was  improved  by  correcting  comprehensibility  problems  of
the  type  that  an  advanced  writing  aid  system  could  detect.
However,  there  are  some  questionable  aspects  to
these  results,  which  further  research  will  clarify. 

The  point  is  that  reading  comprehension  in  task  situations 
this  complex  has  not  been  studied  very  much,  and  there  are  many 
unresolved  methodological  and  empirical  questions.  Therefore, 
before  major  efforts  are  made  to  put  a  new  writing  aid  system  into 
the  field,  it  will  be  essential  to  conduct  the  appropriate 
evaluation  studies. 